
APPARATUS AND METHOD FOR OPERATING A MASTER-SLAVE 
SYSTEM WITH A CLOCK SIGNAL AND A SEPARATE PHASE SIGNAL 

BRIEF DESCRIPTION OF THE INVENTION 

This invention relates generally to digital systems. More particularly, this 
invention relates to a digital system in which a master device and a set of slave devices 
exchange data in response to a clock signal and a separate phase signal. 



Figure 1A is a simplified illustration of a prior art synchronous bus system.
The system is described in U.S. Patent 5,432,823, which is assigned to the assignee of
the present invention, and is hereby incorporated by reference.



The system 20 of Figure 1A includes a clock generator 22, a master device 24,
and a set of slave devices 26_A through 26_N. A transmission channel comprises
three components: a clock-to-master path 28, a turn-around path 29, and a
clock-from-master path 30. The transmission channel ends at a termination block 31,
which may be implemented with a resistor. Each clock pulse from the clock generator 22
travels along the clock-to-master path 28, through the turn-around path 29, along
the clock-from-master path 30, and into the termination block 31. The turn-around
path 29 can be implemented as a single package pin to which the clock-to-master path
28 and the clock-from-master path 30 are connected, provided that the created stub is
relatively short.

The clock generator 22 is any standard clock source. The master device 24 is a
device that can communicate with other master devices and with slave devices, and is
located near the turn-around path 29. By way of example, the master device 24 may
be a microprocessor, a memory controller, or a peripheral controller.

The slave devices 26 can only communicate with master devices and may be
located anywhere along the transmission channel. The slave devices 26 may be
implemented with high speed memories, bus transceivers, peripheral devices, or
input/output ports.

In the system of Figure 1A, a data/control bus 36 (sometimes referred to
simply as a data bus 36) is used to transport data and control signals between the
master device 24 and the slave devices 26_A through 26_N. This operation is timed
by the clock signals on the transmission channel (28, 29, 30). More particularly, the
master device 24 initiates an exchange of data by broadcasting an access request
packet on the data bus 36. Each slave device 26 decodes the access request packet and
determines whether it is the selected slave device and the type of access requested.
The selected device then responds appropriately, either reading or writing a packet of
data in a pipelined fashion.

In the system of Figure 1A, the master device 24 transmits data on the bus 36
contemporaneously with clock signals on the clock-from-master path 30. In other
words, the transmission of data from the master device 24 to the slave devices 26 is
timed by the clock signals on the clock-from-master path 30. Conversely, each slave
device transmits data contemporaneously with the clock signal on the clock-to-master
path 28. That is, the transmission of data from the slave devices 26 to the master
device 24 is timed by the clock signals on the clock-to-master path 28. The scheme of
having clock and data signals travel in the same direction is used to reduce clock-data
skew.

Figure 1B illustrates timing circuitry used to coordinate the transmission and
receipt of signals within a prior art slave device 26. As shown, complementary clock
signals CTM and CTM', respectively on lines 28A and 28B, are received at a
differential input buffer 32, the output of which is applied to the reference and phase
offset inputs of a Delay-Locked Loop (DLL) 33 to generate an internal transmit clock
signal on line 34. Similarly, complementary clock signals CFM and CFM',
respectively on lines 30A and 30B, are received at a differential input buffer 35, the
output of which is applied to the reference and phase inputs of a DLL 36 to generate an
internal receive clock signal on line 37. This differential buffering scheme is different
from the non-differential (single-ended) buffering scheme used for data reception.
Thus, prior art slave devices using this configuration are susceptible to timing skew
errors between the frequency signal and the data signal.

Returning to Figure 1A, a problem with this prior art system is that there are
impedance discontinuities on the clock loop (28, 29, 30). These impedance
discontinuities create standing waves on the clock loop. The standing waves cause a
timing shift, which effectively changes the delay from one slave device to another,
despite uniform spacing. This problem is more fully appreciated with reference to
Figure 2.

Figure 2 illustrates the clock signal delay as a function of a slave device's
position from the master device. Line 40 illustrates the nominal delay increasing at a
uniform rate as the distance from the master device grows. Line 50 illustrates the
effective delay created by standing waves on the clock loop. Line 50 demonstrates
that different slave devices receive a non-uniform delay, despite uniform spacing.
This meandering delay causes timing problems in the reading and writing of data from
and to the data bus 36.

One approach to solving this problem is to incorporate calibration circuitry on
each slave device 26. The problem with this solution is that it complicates the
configuration of each slave device 26, makes each slave device 26 more expensive,
and increases power consumption at each slave device.

Another disadvantage of prior art systems is that attenuation of the clock signal
amplitude is difficult to remedy. Adding buffers or making multiple copies of the
clock to reduce attenuation has the undesirable effect of distorting the clock signal
phase. A related disadvantage of prior art systems is that the lengths of the clock
signal traces coupled to a given slave device must usually match the length of the
datapath between the master and the slave device, complicating system layout.

Yet another disadvantage of prior art systems is that the distribution of the
clock signal in a differential format, as shown in Figure 1B, leads to differences
between the clock and data reception circuitry within the slave device. These
differences tend to produce undesirable timing skew between the clock and data.
Accordingly, prior art systems either suffer the timing skew caused by differential
clock distribution or constrain the clock distribution circuitry to be the same as the
data distribution circuitry, which usually necessitates a non-differential (single-ended)
architecture.

Accordingly, it would be highly desirable to provide an alternate mechanism
for improving the timing performance in a master-slave system.

SUMMARY OF THE INVENTION 

The apparatus of the invention is a master-slave system. The master-slave
system includes a clock and phase signal generator to produce a clock signal at a given
frequency and a phase signal at an "effective frequency," where the phase signal may
or may not be periodic and has an "effective frequency" less than the given frequency.
A clock line is connected to the clock and phase signal generator to carry the clock
signal. A phase line is connected to the clock and phase signal generator to carry the
phase signal. The phase line includes a phase-to-master path to carry a
phase-to-master phase signal and a phase-from-master path to carry a phase-from-master
phase signal. A master device is connected to the clock line and the phase line. A data
bus is connected to the master device. A slave device is connected to the data bus, the
clock line and the phase line. The slave device processes data on the data bus in
response to the clock signal and the phase signal.

The invention also includes a method of operating a master-slave system. The
method includes the step of generating a clock signal at a given frequency and a phase
signal at an "effective frequency" less than the given frequency. The phase signal
includes a phase-to-master signal and a phase-from-master signal. The master device
writes data in response to the clock signal and the phase-to-master signal. The master
device reads data in response to the clock signal and the phase-from-master signal.

The lower "effective frequency" phase signal does not produce standing waves
which would introduce timing errors in the master-slave system. The phase signal is
processed in the slave devices by known components that minimize expense and
power consumption. Thus, the invention provides a low cost solution to improve
timing performance in a master-slave system.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the invention, reference should be made to the 
following detailed description taken in conjunction with the accompanying drawings, 
in which: 

FIGURE 1A illustrates a master-slave system constructed in accordance with
the prior art.

FIGURE 1B illustrates a slave device constructed in accordance with the prior
art.

FIGURE 2 illustrates the prior art problem of meandering delay between a
master device and a set of slave devices arising from standing waves on a clock loop.

FIGURE 3A illustrates a master-slave system constructed in accordance with
an embodiment of the invention.

FIGURE 3B illustrates a master-slave system constructed in accordance with
another embodiment of the invention.

FIGURE 4 illustrates a clock and phase signal generator constructed in
accordance with an embodiment of the invention.

FIGURE 5A illustrates a clock and phase signal generator constructed in
accordance with another embodiment of the invention.

FIGURE 5B illustrates a pseudo random number generator and an associated
timing diagram for one embodiment of the invention.

FIGURE 5C illustrates the timing diagram of a pseudo random number
generator utilized in accordance with another embodiment of the invention.

FIGURE 5D illustrates the timing diagram of a pseudo random number
generator utilized in accordance with still another embodiment of the invention.

FIGURE 6 illustrates a slave device constructed in accordance with an
embodiment of the invention.

Like reference numerals refer to corresponding parts throughout the drawings. 

DETAILED DESCRIPTION OF THE INVENTION 

Figure 3A illustrates a master-slave system 60 operated with a separate clock
signal and phase signal in accordance with an embodiment of the invention. The
system 60 includes a master device 62 (e.g., a memory controller) and a set of slave
devices 64_A to 64_N (e.g., a set of memory devices). The system utilizes a clock and
phase signal generator 70. As its name implies, the clock and phase signal generator
70 produces two distinct signals: a clock signal, which is applied to line 72, and a
phase signal, which is applied to line 78. Thus, the clock and phase signal generator
70 can be implemented with two distinct signal generating circuits: a clock generator
and a phase reference generator.

Observe in the prior art system of Figure 1A that the clock generator 22
generates a single clock signal, not the separate clock and phase signals produced by
the clock and phase signal generator 70. In the prior art system of Figure 1A, the
clock signal on line 28 is used to coordinate the transmission and receipt of data. With
the present invention, separate clock and phase signals are used to coordinate the
transmission and receipt of data.

A separate phase signal is used in accordance with the invention to avoid the
prior art problem of standing waves created by high signal frequencies and impedance
discontinuities. In one embodiment, the separate phase signal of the invention
operates at a lower "effective frequency" than the clock signal on line 72. Herein, the
phrase "effective frequency" refers to an average number of cycles per second
computed over an extended (theoretically infinite) period of time. Thus, an effective
frequency may be computed for both periodic and non-periodic signals. The lower
effective frequency signal results in a phase line with negligible standing waves,
thereby providing phase signals without timing shifts that would otherwise disrupt
timing within the system.
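
The definition of "effective frequency" can be illustrated with a minimal Python
sketch. It assumes the phase signal is sampled once per clock cycle and that one
cycle is counted per low-to-high transition; the function name, the 400 MHz clock
value, and both sample patterns are illustrative assumptions, not part of the
described system.

```python
def effective_frequency(levels, clock_hz):
    """Average cycles per second of a signal sampled once per clock cycle.

    levels   -- sequence of 0/1 samples, one per clock cycle (assumed format)
    clock_hz -- clock frequency in hertz
    One cycle is counted for each rising edge (0 -> 1).
    """
    rising_edges = sum(
        1 for prev, cur in zip(levels, levels[1:]) if prev == 0 and cur == 1
    )
    # rising edges per sample, scaled by samples per second
    return rising_edges * clock_hz / len(levels)


clock_hz = 400e6  # illustrative clock frequency

# Evenly spaced edges: two clocks low, two clocks high (a divide-by-4 signal).
periodic = [0, 0, 1, 1] * 1000

# Unevenly spaced edges (made-up pattern) with the same average edge count.
irregular = [0, 1, 0, 0, 0, 1, 1, 1] * 500

f_periodic = effective_frequency(periodic, clock_hz)
f_irregular = effective_frequency(irregular, clock_hz)
```

Both signals average one cycle per four clock cycles, so under these assumptions
they share the same effective frequency even though their edges are spaced
differently.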

The phase signal of the invention may be a divided-down version of the clock
signal. In that case, the phase signal can be a periodic signal.

Line 72 is connected to a termination block 74, which may be implemented
with a resistor. Line 72 is also connected to a clock node 76 of the master device 62.
The phase line 78 has a turn-around path at node 80 of the master device 62. The
segment of the line 78 from the clock and phase signal generator 70 to the turn-around
path at node 80 forms a phase-to-master path, which carries a phase-to-master phase
signal. The segment of line 78 from the turn-around path at node 80 to the termination
block 82 (e.g., a termination resistor) forms a phase-from-master path, which carries a
phase-from-master phase signal.

Each slave device 64 has a clock node 84 connected to the clock line 72, a
phase-to-master node 86 connected to the phase-to-master path of line 78, and a
phase-from-master node 88 connected to the phase-from-master path of line 78. Observe
that there are separate input nodes for the clock signal and the phase signals. In the
prior art, phase and timing information is gathered solely from the clock-to-master and
clock-from-master nodes. In contrast, with the present invention, phase and timing
information is gathered from separate clock and phase nodes.

As shown in Figure 3A, each slave device 64 further includes one or more data
nodes 94 to process data carried on data bus 90. Data bus 90 preferably includes a
termination block 92.

As discussed below, each slave device 64 uses the clock signal on node 84 and
the phase-to-master signal on phase-to-master node 86 to transmit data to the master
device 62 via data bus 90. Similarly, each slave device 64 uses the clock signal on
node 84 and the phase-from-master signal on phase-from-master node 88 to receive
data via the data bus 90 from the master device 62.

Figure 3A illustrates phase reference/data path length matching of slave
devices 64_A through 64_N. Observe in Figure 3A that the phase-to-master path
between slave devices is a length "L", which is equivalent to the data bus length, "L",
between the same devices. Similarly, the phase-from-master path between slave
devices is the length "L", which is equivalent to the data bus length between the same
devices. Unlike in prior art systems, the length of the clock signal path does not have
to be matched to the length of the data bus. Thus, the clock signal path can be
flexibly routed in a variety of ways.

Figure 3B illustrates differential signal buffers 73 to route a differential clock
signal from the clock and phase signal generator 70. Thus, in accordance with the
invention, a slave device can process a differential clock signal and a non-differential
(single-ended) phase signal. This results in system design flexibility, as discussed
below. In addition, the clock signal can be buffered, as desired, since, unlike in prior
art systems, skew between the clock signal and data signal does not have to be
minimized.

Figure 4 illustrates an implementation of the clock and phase signal generator
70. The circuit 70 includes a clock source 100. The clock signal from the clock
source 100 may be buffered at buffer 101. In a differential signaling mode, an
inverting node 102 of the buffer 101 is used to produce an inverted version of the
clock signal. The clock signal and the inverted signal are applied to line 72 via nodes
103. Thus, a clock signal generator of the invention includes a clock source 100, and
optionally, a buffer 101.



Figure 4 also illustrates that the circuit 70 includes a divide-by-N circuit 104 to
divide down the clock signal from the clock source 100. The divide-by-N circuit 104
produces a lower frequency phase signal that will not produce standing waves on the
phase line 78. The divide-by-N circuit may divide by a non-integer value (e.g.,
3.5). Thus, the divide-by-N circuit may be viewed as an M/N divider, where M and N
are each integers.

By way of example, in a 400 MHz system a clock cycle has a period of 2.5 ns.
If a divide by 16 circuit is used, the phase signal has a period of 40 ns. A signal of this
type will not produce standing waves on the phase line 78. Preferably, the lower
frequency phase signal is processed with a buffer 106 prior to being applied to the
phase line 78 via node 107. Thus, in one embodiment of the invention, a phase signal
generator includes a clock source 100, a divide-by-N circuit 104, and a buffer 106.
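
The arithmetic of the 400 MHz example can be checked with a short Python sketch.
The helper name and variable names are illustrative; only the 400 MHz clock and the
divide-by-16 ratio come from the text.

```python
def period_ns(freq_mhz):
    """Period in nanoseconds of a signal whose frequency is given in MHz."""
    return 1000.0 / freq_mhz


clock_mhz = 400.0   # clock frequency from the example
divide_by = 16      # divide-by-N ratio from the example

clock_period_ns = period_ns(clock_mhz)      # 2.5 ns
phase_mhz = clock_mhz / divide_by           # 25 MHz
phase_period_ns = period_ns(phase_mhz)      # 40 ns
```

Equivalently, the phase period is the clock period multiplied by the divide ratio:
16 x 2.5 ns = 40 ns.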

The clock and phase signal generator 70B of Figure 4 produces a phase signal
that is periodic in nature. The remaining embodiments of the clock and phase signal
generator produce a phase signal that is non-periodic.
Figure 5A illustrates an alternate embodiment of the clock and phase signal
generator 70. The circuit of Figure 5A has the same components as the circuit of
Figure 4 with the exception that the divide-by-N circuit 104 is replaced with a pseudo
random number generator 110. The pseudo random number generator 110 generates a
non-periodic or non-continuous spread spectrum signal. Thus, the phase signal
produced by the generator of Figure 5A does not have a single frequency.
Nevertheless, this signal can be used as a reference signal to coordinate the receipt and
transmission of data, as discussed below. The non-periodic or non-continuous signal
is useful in various noise reduction or power saving schemes.

The concept behind the pseudo random number generator 110 is that so long as
the phase reference cycles at least once every N cycles of the clock signal, proper phase
lock can be maintained by the slave delay-locked loops. Figures 5B, 5C, and 5D
illustrate timing diagrams for three different embodiments of the pseudo random
number generator 110.

Effective frequency can be explained with reference to Figure 5B. Pseudo
random number generator 110A consists of 4 linear feedback shift registers 701-704
(hereinafter LFSRs) and an exclusive-OR (XOR) gate 700, with inputs from the LFSR
703 and LFSR 704 and an output to the LFSR 701. The XOR gate 700 produces a
digital high output when the outputs of flip-flops 703 and 704 are opposite; otherwise
a digital low output signal is produced. This results in the following pattern at the
output of flip-flop 701: 10001001101011, as shown in Figure 5B. Thus, even with a
periodic input clock signal, CLK, the output of the pseudo random number generator
110A is not periodic. The output signal provides only phase information, not clock
information. Thus, it has an effective frequency, but is not periodic.
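
The feedback arrangement described for Figure 5B can be sketched in Python as a
four-stage shift register whose first stage is fed by the XOR of the third and fourth
stages. The all-ones starting state, the function names, and the per-clock sampling
are assumptions made for illustration; the figure itself is not reproduced here.

```python
def lfsr_output(num_clocks, state=(1, 1, 1, 1)):
    """Simulate a 4-stage shift register with XOR feedback.

    state holds the outputs of stages 701..704. On each clock the XOR of
    stages 703 and 704 is shifted into stage 701, and every other stage
    takes the value of the stage before it. Returns the output of stage
    701 sampled once per clock. The initial state is an assumption.
    """
    q1, q2, q3, q4 = state
    out = []
    for _ in range(num_clocks):
        out.append(q1)
        q1, q2, q3, q4 = q3 ^ q4, q1, q2, q3
    return out


def longest_run(bits):
    """Length of the longest stretch of clocks without a transition."""
    longest = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


bits = lfsr_output(30)
pattern = "".join(str(b) for b in bits[:15])   # 100010011010111

# Rising edges within one 15-clock span (wrapping into the next span).
rising_edges = sum(1 for i in range(15) if bits[i] == 0 and bits[i + 1] == 1)
max_gap = longest_run(bits)
```

With this starting state the first 14 outputs are 10001001101011, the sequence
repeats after 15 clocks, and the output never stays at one level for more than 4
consecutive clocks, which equals the number of stages. Counting one cycle per rising
edge gives 4 cycles per 15 clocks, so the effective frequency in this sketch is 4/15
of the clock frequency.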

For the circuit 110A of Figure 5B, the number of registers/bits is preferably
less than N, where N is the maximum number of clock cycles that can transpire
without a transition of the phase reference and without the loss of phase lock. The
LFSR must have a transition at least once every K clock cycles, where K is the number
of registers in the LFSR. Naturally, other circuits may be used to implement a
pseudo random number generator, as appreciated with reference to Figures 5C and 5D.

Figure 5C illustrates timing diagrams for an alternate pseudo random generator
used in accordance with an embodiment of the invention. In the scheme of Figure 5C,
the pseudo random number generator is configured to output a new value M after each
N cycles of the clock, CLK. The value M indicates the phase reference signal is to
transition, e.g. from high to low, at the Mth cycle of the N cycles. The phase reference
signal (Phase Ref) transitions, e.g. from low to high, after K cycles or at least before N
cycles of CLK. In Figure 5C, K = 1. So, if the pseudo random number generator
randomly specifies a number between 1 and N (a new number M after every N cycles
of the clock signal), phase lock is maintained using a non-periodic phase reference
signal that can be described as having an effective frequency. Assuming complete
randomness in the value of M, the effective frequency of the phase reference signal
converges to the clock frequency/N. With this approach, the effective frequency is the
average number of cycles per second of the phase reference signal over an extended
period of time.
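
The scheme of Figure 5C can be approximated with the following Python sketch. It
is a simplified model under stated assumptions: the phase reference idles high, M is
drawn from 1 to N-K rather than 1 to N so that the low pulse always ends inside its
own window, and Python's seeded `random.Random` stands in for the pseudo random
number generator. All function and variable names are hypothetical.

```python
import random


def phase_reference(num_windows, n, k=1, seed=0):
    """Build a phase reference, one 0/1 sample per clock cycle.

    In each window of n clock cycles a value m is drawn at random. The
    signal goes low at the m-th cycle of the window (cycles numbered from
    1) and returns high k cycles later. Drawing m from 1..n-k is a
    simplifying assumption that keeps each pulse inside its window.
    """
    rng = random.Random(seed)
    levels = []
    for _ in range(num_windows):
        m = rng.randint(1, n - k)
        window = [1] * n
        for cycle in range(m, m + k):
            window[cycle - 1] = 0
        levels.extend(window)
    return levels


def count_falling_edges(levels, initial=1):
    """Count high-to-low transitions, assuming the signal starts high."""
    edges = 0
    prev = initial
    for cur in levels:
        if prev == 1 and cur == 0:
            edges += 1
        prev = cur
    return edges


N = 8
levels = phase_reference(num_windows=1000, n=N)
edges = count_falling_edges(levels)
cycles_per_clock = edges / len(levels)   # fraction of the clock frequency
```

Because each window in this model contains exactly one high-to-low transition, the
phase reference averages one cycle per N clock cycles regardless of where the
transitions fall, which is consistent with an effective frequency of the clock
frequency divided by N.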

Figure 5D illustrates the use of a pseudo random number generator configured
to output a value M used to divide the frequency of the clock signal, CLK. A new
value M is specified every N cycles. Preferably, M ranges randomly between 1 and
N/2. In this embodiment, the signal has no fixed period over time intervals greater
than N clock cycles, but does exhibit limited periodicity within selected intervals of N
clock cycles.

Figure 6 illustrates a slave device 64 constructed in accordance with an
embodiment of the invention. The figure illustrates a clock node 84 to receive a
single-ended or differential clock signal. The figure also illustrates a phase-to-master
node 86 to receive a phase-to-master signal, which may be a single-ended or
differential signal. The slave device 64 also includes a phase-from-master node 88 to
receive a phase-from-master signal, which may be a single-ended or differential signal.
Advantageously, the present invention allows single-ended phase signals. Prior art
master-slave systems typically require that the phase information be acquired from the
same buffer used to receive the clock signal. In some systems it is advantageous to
use a differential mode clock even though this tends to produce skew between the
clock input buffer and non-differential data input buffers. It is advantageous to design
the phase input buffers to be as identical as possible to the data input buffers. The
present invention provides this type of design flexibility.

As shown in Figure 6, a first delay-locked loop 120 processes the clock signal
and the phase-from-master signal to produce a receive clock signal, which is applied to
receive circuitry 126. The receive circuitry 126, responsive to the receive clock signal,
receives data from node 94, which is attached to the data bus 90 of Figure 3. In the
case of a slave device that is a memory, the receive circuitry 126 routes the data to the
memory core 130.

A second delay-locked loop 122 processes the clock signal and the
phase-to-master signal to produce a transmit clock signal, which is applied to transmit
circuitry 128. The transmit circuitry 128, responsive to the transmit clock signal,
transmits data to node 94, which is attached to the data bus 90. In the case of a slave
device that is a memory, the transmit circuitry 128 routes the data from the memory
core 130.

Each delay-locked loop operates to align the received clock signal to the
relevant phase signal for optimum data recovery and transmission. By using a random
phase reference signal and a phase distribution line constructed to match the physical
attributes of the data lines, the parasitic effects will match; therefore, the delay-locked
loop output will be automatically adjusted for optimum signaling margin.

Those skilled in the art will recognize a number of advantages associated with
the present invention. The lower effective frequency phase signal does not produce
standing waves which would introduce timing errors in the master-slave system. The
phase signal is processed in the slave devices by known components (such as a
delay-locked loop) that minimize expense and power consumption. Thus, the invention
provides a low cost solution to improve timing performance in a master-slave system.

Observe that with the present invention, the clock line 72 may have an arbitrary
topology since the clock phase is not relevant. This allows for more flexibility in
system layout. Another advantage associated with the invention is that the clock can
be buffered and routed as desired. In many prior art master-slave systems, the clock
and data signals must be constructed in the same manner. With the present invention,
one may design a system to produce a clock signal with a larger amplitude than the
phase reference signal.

Another advantage of the present invention is that there is a great deal of design
flexibility in the buffering circuitry of the slave devices. That is, different buffering
schemes can be used for the clock signal, the phase signal, and the data signal. These
buffering schemes can be matched or mis-matched, as desired. In contrast, in the prior
art, the buffering for the clock and phase signals is unitary and therefore results in
various design tradeoffs. For example, if the clock and phase signal buffering matches
the data reception buffering, then a single-ended scheme must be used. If the clock
and phase signal buffering does not match the data reception buffering, timing skew
problems arise. This timing skew is avoided with the present invention by having the
phase signal and data signal buffers in an identical configuration, while the clock
signal buffering is optimized in a separate configuration, e.g., a differential signal
buffering configuration.

The foregoing description, for purposes of explanation, used specific
nomenclature to provide a thorough understanding of the invention. However, it will
be apparent to one skilled in the art that the specific details are not required in order to
practice the invention. In other instances, well known circuits and devices are shown
in block diagram form in order to avoid unnecessary distraction from the underlying
invention. Thus, the foregoing descriptions of specific embodiments of the present
invention are presented for purposes of illustration and description. They are not
intended to be exhaustive or to limit the invention to the precise forms disclosed;
obviously, many modifications and variations are possible in view of the above
teachings. The embodiments were chosen and described in order to best explain the
principles of the invention and its practical applications, to thereby enable others
skilled in the art to best utilize the invention and various embodiments with various
modifications as are suited to the particular use contemplated. It is intended that the
scope of the invention be defined by the following claims and their equivalents.